Structured Solution Methods for Non-Markovian Decision Processes
Abstract
Markov Decision Processes (MDPs), currently a popular method for modeling and solving decision-theoretic planning problems, are limited by the Markovian assumption: rewards and dynamics depend on the current state only, and not on previous history. Non-Markovian decision processes (NMDPs) can also be defined, but then the more tractable solution techniques developed for MDPs cannot be directly applied. In this paper, we show how an NMDP, in which temporal logic is used to specify history dependence, can be automatically converted into an equivalent MDP by adding appropriate temporal variables. The resulting MDP can be represented in a structured fashion and solved using structured policy construction methods. In many cases, this offers significant computational advantages over previous proposals for solving NMDPs.
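The idea of adding a temporal variable can be illustrated with a toy sketch. This is not the paper's algorithm, which works from temporal logic formulas and structured representations. It only assumes a hypothetical reward that pays off when the state `goal` is reached after the state `key` was visited earlier. All names (`history_reward`, `step_augmented`, `seen_key`, and so on) are invented for illustration. One extra Boolean recording "`key` was visited before now" is enough to make the reward depend on the expanded current state alone.

```python
from itertools import product


def history_reward(history):
    """Non-Markovian reward (hypothetical): 1.0 if the current state is
    'goal' and 'key' occurred strictly earlier in the history."""
    *past, current = history
    return 1.0 if current == "goal" and "key" in past else 0.0


def step_augmented(aug_state, next_state):
    """Move to next_state and update the temporal variable seen_key,
    which records whether 'key' was visited before the new state."""
    state, seen_key = aug_state
    return (next_state, seen_key or state == "key")


def markov_reward(aug_state):
    """Reward on the expanded state; it needs no history."""
    state, seen_key = aug_state
    return 1.0 if state == "goal" and seen_key else 0.0


def check_equivalence(states, horizon):
    """Compare both reward definitions on every trajectory of the
    given length over the given base states."""
    for history in product(states, repeat=horizon):
        aug = (history[0], False)  # nothing has been seen at the start
        for nxt in history[1:]:
            aug = step_augmented(aug, nxt)
        if markov_reward(aug) != history_reward(list(history)):
            return False
    return True
```

Calling `check_equivalence(["start", "key", "goal"], 4)` enumerates all length-4 trajectories over three base states and checks that the two reward definitions agree on each one. The cost of the construction is visible in the state space: each added temporal variable can double the number of expanded states.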
Similar resources
Properties of Planning with Non-Markovian Rewards
We examine technologies designed to solve decision processes with non-Markovian rewards (NMRDPs). More specifically, target decision processes exhibit Markovian dynamics, called grounded dynamics, and desirable behaviours are modelled as state trajectories specified in a temporal logic. Each technology operates by automatically translating NMRDPs into corresponding equivalent MDPs amenable to c...
Structured Solution Methods for
Markov Decision Processes (MDPs), currently a popular method for modeling and solving decision-theoretic planning problems, are limited by the Markovian assumption: rewards and dynamics depend on the current state only, and not on previous history. Non-Markovian decision processes (NMDPs) can also be defined, but then the more tractable solution techniques developed for MDPs cannot be directly...
Decision-Theoretic Planning with non-Markovian Rewards
A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more naturally expressed as properties of execution sequences rather than as properties of states, NMRDPs form a more natural model than the commonly adopted fully Markovi...
Implementation and Comparison of Solution Methods for Decision Processes with Non-Markovian Rewards
This paper examines a number of solution methods for decision processes with non-Markovian rewards (NMRDPs). They all exploit a temporal logic specification of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to well-known MDP solution methods. They differ, however, in the representation of the target MDP and the class o...
Anytime State-Based Solution Methods for Decision Processes with non-Markovian Rewards
A popular approach to solving a decision process with non-Markovian rewards (NMRDP) is to exploit a compact representation of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to our favorite MDP solution method. The contribution of this paper is a representation of non-Markovian reward functions and a translation into MDP aimed a...
Publication date: 1997